# **Design of Low Power HT based CORDIC Architecture for Embedded Processors**

S. Varkeessheeba PG Scholar, Department of ECE, PSNA College of Engineering and Technology, Dindigul-624622 varkeesece@gmail.com

V.Magudeeswaran Assistant Professor, Department of ECE, PSNA College of Engineering and Technology, Dindigul-624622 magudeeswaran@gmail.com

Abstract- Recent developments in low power technologies have produced high-density, high-speed devices that offer the flexibility of custom computing while retaining the adaptability of a software solution. These features are well suited to image processing algorithms, which are computationally intensive and repetitive in nature. The deep pipelining and parallelism usually needed for real-time image analysis can be achieved readily by exploiting hardware design. The Hough Transform is a powerful and robust global image processing tool for feature recognition and detection. The CORDIC algorithm uses simple addition and shifting operations to implement complicated trigonometric functions. This paper combines the novel CORDIC algorithm with the Hough Transform to reduce power for real-time embedded processors. Overall, the CORDIC based Hough Transform reduces power consumption from 51 mW to 27 mW and also reduces delay.

Keywords: Coordinate Rotation Digital Computer (CORDIC), Hough Transform (HT), Transposition buffer, Low Power.

## 1. INTRODUCTION

The continuous development of digital signal processing applications, such as video-based advanced driver assistance and music classification, has brought them into embedded systems, which use a surface-mount form factor and consume little power. Telecommunication systems also use many embedded systems, from telephone switches to mobile phones. These applications involve complex mathematical functions such as trigonometric operations [1] [2] [3], square roots [4] and logarithms, which can be calculated using CORDIC.

CORDIC, an acronym for Coordinate Rotation Digital Computer, is a simple and hardware-efficient algorithm that was first introduced by Volder [5] to compute trigonometric functions. Later, it was generalized to hyperbolic functions, multiplication, division, etc. by Walther [6]. CORDIC uses only shift-and-add arithmetic [7] with table look-up to implement different functions. By making slight changes to the initial conditions and the look-up table values, it can be used to efficiently implement exponential, trigonometric and hyperbolic functions, coordinate transformations, etc., using the same hardware. Since it uses only shift-and-add arithmetic, a VLSI implementation of the algorithm is easily achievable. It is normally used when no hardware multiplier is available, as in simple microcontrollers and FPGAs.
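The shift-and-add iteration described above can be illustrated with a minimal sketch of rotation-mode CORDIC. The function name `cordic_sin_cos`, the iteration count of 16 and the use of floating-point arithmetic are assumptions made for readability; they are not taken from the architecture proposed in this paper, where each multiplication by 2^-i would be a right shift.

```python
import math

def cordic_sin_cos(theta, iterations=16):
    """Approximate (sin, cos) of theta in radians, for |theta| <= pi/2."""
    # Look-up table of the elementary rotation angles atan(2^-i).
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # The iterations stretch the vector by a constant gain, so the
    # starting vector is pre-scaled by the inverse of that gain.
    k = 1.0
    for i in range(iterations):
        k /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = k, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0          # rotate towards z = 0
        # Multiplying by 2^-i corresponds to a right shift by i bits in hardware.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x                               # (sin, cos)

s, c = cordic_sin_cos(math.pi / 6)
```

Changing the starting vector or the contents of the angle table is what allows the same loop to serve other functions, as noted in the text.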

Typically, the CORDIC algorithm is used to implement linear transformations such as the Discrete Fourier Transform, Discrete Hartley Transform and Chirp-Z transform; digital filters such as adaptive filters; and matrix-based digital signal processing algorithms [7][8] such as QR factorization and singular value decomposition. The CORDIC algorithm is highly suited to Very Large Scale Integration (VLSI) implementation. The Hough transform is a feature extraction technique used in computer vision, image analysis and digital image processing, and it is widely used for detecting straight lines [9] in an image.

The underlying principle of the Hough transform is that an infinite number of potential lines pass through any point, each at a different orientation. The Hough transform finds imperfect instances of objects within a certain class of shapes by a voting procedure [10]. This voting procedure is carried out in a parameter space, from which object candidates are obtained as local maxima in a so-called accumulator space that is explicitly constructed by the algorithm for computing the Hough transform. For embedded applications, CORDIC is used to achieve a real-time implementation of the Hough transform with minimum power consumption.
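The voting procedure can be sketched in software as follows. This is a plain reference version of line detection in the usual (ρ, θ) parameterization, ρ = x cos θ + y sin θ, and not the hardware architecture of this paper; the function names `hough_lines` and `strongest_line` and the choice of 180 angle steps are assumptions.

```python
import math

def hough_lines(points, width, height, n_theta=180):
    """Accumulate votes in (rho, theta) space for a list of (x, y) edge points."""
    rho_max = int(math.ceil(math.hypot(width, height)))
    # Accumulator rows cover rho in [-rho_max, rho_max]; columns are theta indices.
    acc = [[0] * n_theta for _ in range(2 * rho_max + 1)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[rho + rho_max][t] += 1      # one vote per point per angle
    return acc, rho_max

def strongest_line(acc, rho_max):
    """Return (votes, rho, theta_index) of a cell holding the maximum vote count."""
    return max((votes, r - rho_max, t)
               for r, row in enumerate(acc)
               for t, votes in enumerate(row))

# Ten points on the vertical line x = 5 all vote for rho = 5 at theta = 0.
points = [(5, y) for y in range(10)]
acc, rho_max = hough_lines(points, 10, 10)
```

Each point votes once for every candidate orientation, so collinear points pile up in a single accumulator cell, which then appears as a local maximum.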

## 2. CORDIC BASED HOUGH TRANSFORMS

A block diagram of the CORDIC based Hough architecture is shown in Figure 1. Coordinate Rotation Digital Computer (CORDIC) is a well-known, versatile approach. The Hough transform based CORDIC is a hardware-efficient algorithm for computing trigonometric and other functions that uses only shifts and adds. Most Hough-based methods encounter the problem of evaluating implicit trigonometric and transcendental functions, which complicates a monolithic implementation of the whole algorithm. To overcome this problem, CORDIC based architectures are used to generate the vote address in parameter space.

Figure 1 CORDIC based Hough Architecture (blocks: input data or signal x(0) to x(7), CORDIC based HT, transposition buffer, CORDIC based HT, transformed data or signal X(0) to X(7))

This method is very effective because it avoids the multiplication term. A transposition buffer is placed between the CORDIC based Hough Transform blocks; it obtains its input data from the first CORDIC based Hough transform.
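One way to see how the vote address can be generated without multipliers is to note that rotating the pixel vector (x, y) by -θ yields x cos θ + y sin θ in its first component. The fixed-point sketch below uses only shifts and adds inside the loop. The word length (`FRAC_BITS`), the iteration count, the restriction to |θ| ≤ π/2 and the function name `cordic_rho` are assumptions for illustration, and the final gain correction is done in floating point here for brevity.

```python
import math

FRAC_BITS = 16                    # assumed number of fractional bits
ITERATIONS = 16                   # assumed number of CORDIC iterations
ONE = 1 << FRAC_BITS
# Elementary angles atan(2^-i) stored as fixed-point integers.
ANGLE_TABLE = [round(math.atan(2.0 ** -i) * ONE) for i in range(ITERATIONS)]
# Constant gain introduced by the un-normalized rotations.
GAIN = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(ITERATIONS))

def cordic_rho(x, y, theta):
    """Return x*cos(theta) + y*sin(theta) for integer x, y and |theta| <= pi/2."""
    xr, yr = x * ONE, y * ONE     # integer pixel coordinates in fixed point
    z = round(-theta * ONE)       # rotate the vector by -theta
    for i in range(ITERATIONS):
        if z >= 0:
            xr, yr = xr - (yr >> i), yr + (xr >> i)
            z -= ANGLE_TABLE[i]
        else:
            xr, yr = xr + (yr >> i), yr - (xr >> i)
            z += ANGLE_TABLE[i]
    # Remove the constant gain; hardware would fold this into a fixed scaling.
    return xr / (GAIN * ONE)

rho = cordic_rho(3, 4, 0.5)       # close to 3*cos(0.5) + 4*sin(0.5)
```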

### 2.1. Work flow

The Hough transform work flow is shown in Figure 2. The Hough transform consists of three processes: Processing Elements (PE), accumulation and voting. The Processing Elements generate the Sum of Absolute Difference (SAD) value.





Figure 2 Hough Transform

The Sum of Absolute Difference can be expressed as

$$SAD = \sum_{i=0}^{3} \sum_{j=0}^{3} \left| X_{ij} - Y_{ij} \right| \quad (1)$$
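Equation (1) can be evaluated directly, as in the short sketch below. It assumes that X and Y are two 4×4 blocks of pixel values, which the text does not state explicitly; the function name `sad_4x4` is hypothetical.

```python
def sad_4x4(x_block, y_block):
    """Sum of absolute differences between two 4x4 blocks, as in Equation (1)."""
    return sum(abs(x_block[i][j] - y_block[i][j])
               for i in range(4)
               for j in range(4))

ones = [[1] * 4 for _ in range(4)]
zeros = [[0] * 4 for _ in range(4)]
total = sad_4x4(ones, zeros)      # every one of the 16 positions differs by 1
```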

The accumulation unit performs two processes: inter blocking incrementing and intra blocking incrementing. Inter blocking incrementing is shown in Figure 3; a step table, which receives its input from the SAD, is introduced in order to skip the zero blocks. Inter blocking incrementing is implemented when $N\sin\theta$ can be precomputed. The Col\_reg and Row\_reg registers are updated each time the processing of a block row is completed. In intra blocking incrementing, the value computed by the inter blocking stage is added to the corresponding $\cos\theta$ and $\sin\theta$ values.



Figure 3 Inter blocking incrementing
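The idea behind the two incrementing stages can be sketched in software: since ρ = x cos θ + y sin θ is linear in x and y, the value at a block origin can be advanced by adding precomputed N cos θ or N sin θ, and the values inside a block follow by adding cos θ and sin θ. The sketch below is an interpretation under assumptions; the exact roles given to `row_reg` and `col_reg`, the block size `n` and the function name `block_rhos` are illustrative and not taken from the figures.

```python
import math

def block_rhos(theta, n_blocks_x, n_blocks_y, n=4):
    """Compute rho for every pixel using additions only inside the loops."""
    c, s = math.cos(theta), math.sin(theta)
    n_cos, n_sin = n * c, n * s            # precomputed once per angle
    result = {}
    row_reg = 0.0                          # rho at the start of the current block row
    for by in range(n_blocks_y):
        col_reg = row_reg                  # rho at the origin of the current block
        for bx in range(n_blocks_x):
            rho_row = col_reg              # intra-block: start at the block origin
            for j in range(n):
                rho = rho_row
                for i in range(n):
                    result[(bx * n + i, by * n + j)] = rho
                    rho += c               # next pixel in x adds cos(theta)
                rho_row += s               # next pixel in y adds sin(theta)
            col_reg += n_cos               # inter-block: next block column
        row_reg += n_sin                   # inter-block: next block row
    return result

rhos = block_rhos(0.3, 2, 2)
```

Every entry of `rhos` agrees with the direct product form up to rounding, which is what allows the multiplications to be replaced by accumulations.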

The computed value is divided into an integer part $i_0$ and a fractional part $f_0$, but only the fractional part $f_0$ is used for calculating the vote-offsets, as shown in Figure 4. The first stage of Figure 4 calculates the vote-offset Vo<sub>i</sub> for the i<sup>th</sup> input. These vote-offsets Vo<sub>i</sub> are decoded by decoders, as shown in the second stage.



Figure 4 Intra blocking incrementing

The outputs of the decoders are combined with the values of the corresponding inputs using a combinational logic circuit to determine V<sub>i</sub>, which represents the consolidated number of votes for each different vote-offset.



Figure 5 Vote consolidation
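A software analogue of the consolidation step is sketched below. It assumes that each input is a single bit (1 for a feature pixel) and that each decoder produces a one-hot word for its vote-offset; these details, and the function name `consolidate_votes`, are assumptions rather than a description of the actual circuit.

```python
def consolidate_votes(pixels, vote_offsets, n_offsets):
    """Count, for each vote-offset, how many active inputs selected it."""
    votes = [0] * n_offsets
    for pixel, offset in zip(pixels, vote_offsets):
        # Decoder: one-hot encoding of this input's vote-offset.
        one_hot = [1 if k == offset else 0 for k in range(n_offsets)]
        for k in range(n_offsets):
            # Gate the decoder output with the input bit, then accumulate.
            votes[k] += one_hot[k] & pixel
    return votes

# Inputs 0 and 3 are active with offset 0, input 2 is active with offset 1.
v = consolidate_votes([1, 0, 1, 1], [0, 0, 1, 0], 2)
```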

Between the approximate HT blocks, a real-time row-parallel transposition buffer circuit is required. The transposition buffer block is shown in Figure 6. This block ensures data ordering by converting the row-transformed data from the first HT circuit to the transposed format required by the column transform circuit.



Figure 6 Transposition buffer
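Functionally, the transposition buffer stores the rows produced by the first stage and hands them to the second stage as columns. The sketch below shows this behaviour only; it does not model timing or the row-parallel organisation, and the name `transpose_rows` is hypothetical.

```python
def transpose_rows(rows):
    """Store row-transformed data and read it back column by column."""
    n = len(rows)
    buffer = [list(r) for r in rows]                  # write phase: one row at a time
    return [[buffer[r][c] for r in range(n)]          # read phase: one column at a time
            for c in range(n)]

out = transpose_rows([[1, 2], [3, 4]])                # [[1, 3], [2, 4]]
```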

## 3. RESULTS AND DISCUSSION

Figure 7 shows the simulation results of the CORDIC based Hough Transform, which was simulated using the Modelsim tool. The various parameters of the CORDIC based Hough architecture were computed using the Xilinx tool.



Figure 7 Simulation result of CORDIC based Hough Transform

In this architecture, the number of iterations is reduced, which increases the speed compared with the CORDIC based DCT architecture. Power, delay and junction temperature are compared in Table 1.

| Parameter            | CORDIC based DCT Architecture | CORDIC based HT Architecture |
|----------------------|-------------------------------|------------------------------|
| Power consumption    | 51 mW                         | 27 mW                        |
| Delay                | 5.198                         | 2.758                        |
| Junction temperature | 30 °C                         | 26 °C                        |

Table 1 Comparison Table
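The relative savings implied by Table 1 follow from simple arithmetic, shown here as a small sketch; the helper name `relative_reduction` is hypothetical, and the delay values are used as listed, without units.

```python
def relative_reduction(baseline, improved):
    """Percentage reduction of `improved` relative to `baseline`."""
    return 100.0 * (baseline - improved) / baseline

power_saving = relative_reduction(51.0, 27.0)      # power, from 51 mW to 27 mW
delay_saving = relative_reduction(5.198, 2.758)    # delay, as listed in Table 1
```

Both quantities come out at roughly 47%, so power and delay are each reduced by a little under half relative to the CORDIC based DCT architecture.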

## 4. CONCLUSION

The vector rotation and trigonometric computation provided by the CORDIC algorithm apply not only to the Hough Transform but also to other image processing applications such as graphics animation and the Discrete Fourier Transform (DFT). The CORDIC based Hough transform architecture thus provides very low power consumption for real-time embedded processors. It was simulated using the Modelsim tool, and the power analysis was carried out using the Xilinx tool.

## REFERENCES

- [1] A Prateek, Magdum, R Bhagyalaxmi, J Honnakasturi, M Rudagi (2014), "Performance analysis and FPGA Implementation of Radix-2 and Radix-4 CORDIC", International Journal of Engineering Science and Innovative Technology (IJESIT), Volume 3, Issue 4, pg no: 222-231.
- [2] Er. Lalit Bagga, Er. Manoj Arora, Er. R.S. Chauhan, Er. Parshant Gupta (2012), "Technology Roadmap of CORDIC Algorithm", International Journal of Engineering, Business and Enterprise Applications (IJEBEA), pg no: 14-21.
- [3] Amit Singh, Sandeep D. Bhad (2013), "Analysis of Simple CORDIC Algorithm Using MATLAB", International Journal of Scientific & Engineering Research, Volume 4, Issue 6, June, pg no: 132-134.
- [4] Zhong-Ho Chen, Alvin W. Y. Su, Ming-Ting Sun (2012), "Resource-Efficient FPGA Architecture and Implementation of Hough Transform", IEEE Trans. Very Large Scale Integr. (VLSI) Syst., Vol. 20, No. 8.
- [5] H N Srinivasa Murthy, M Roopa (2012), "FPGA Implementation of Sine and Cosine Generators using CORDIC Algorithm", International Journal of Innovative Technology and Exploring Engineering (IJITEE), ISSN: 2278-3075, Volume-1, Issue-6, pg no: 16-19.
- [6] Ahmed Madian, Muaz Aljarhi (2013), "A Multi Cordic Architecture on FPGA Platform", World Academy of Science, Engineering and Technology, International Journal of Electrical, Computer, Electronics and Communication Engineering, Vol. 7, No. 12, pg no: 1233-1241.
- [7] Leena Vachhani, K. Sridharan, Pramod K. Meher (2009), "Efficient CORDIC Algorithms and Architectures for Low Area and High Throughput Implementation", IEEE Transactions on Circuits and Systems II, Vol. 56, No. 1, pg no: 61-65.
- [8] Xin Zhou, N Tomagou, K Nakano (2013), "Efficient Hough Transform on the FPGA using DSP Slices and Block RAMs", IEEE Conference Publications, pg no: 771-778.
- [9] K Lakshmi Priya (2014), "Efficient Implementation of Reconfigurable MIMO Decoder Accelerator Chip", International Journal of Innovative Research in Science, Engineering and Technology, Vol. 3, Issue 5, pg no: 12244-12250.
- [10] U Vishnoi and T G Noll (2012), "Area- and energy-efficient CORDIC accelerators in deep sub-micron CMOS technologies", Copernicus Publications on behalf of the URSI Landesausschuss, Vol. 10, pg no: 207-213.